feat(content_safety): add support to auto select multilingual refusal bot messages #1530
base: develop
Conversation
Detect the user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.
```python
DEFAULT_REFUSAL_MESSAGES: Dict[str, str] = {
    "en": "I'm sorry, I can't respond to that.",
    "es": "Lo siento, no puedo responder a eso.",
    "zh": "抱歉,我无法回应。",
    "de": "Es tut mir leid, darauf kann ich nicht antworten.",
    "fr": "Je suis désolé, je ne peux pas répondre à cela.",
    "hi": "मुझे खेद है, मैं इसका जवाब नहीं दे सकता।",
    "ja": "申し訳ありませんが、それには回答できません。",
    "ar": "عذراً، لا أستطيع الرد على ذلك.",
    "th": "ขออภัย ฉันไม่สามารถตอบได้",
}
```
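A minimal sketch of how the table above could drive message selection, assuming a `get_refusal_message` helper (hypothetical, not necessarily the PR's actual function name) that falls back to English for unsupported codes:

```python
from typing import Dict

# Subset of the supported languages, for brevity.
DEFAULT_REFUSAL_MESSAGES: Dict[str, str] = {
    "en": "I'm sorry, I can't respond to that.",
    "es": "Lo siento, no puedo responder a eso.",
    "zh": "抱歉,我无法回应。",
}


def get_refusal_message(lang_code: str) -> str:
    """Return the refusal message for lang_code, defaulting to English."""
    return DEFAULT_REFUSAL_MESSAGES.get(lang_code, DEFAULT_REFUSAL_MESSAGES["en"])
```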
If we later had other multilingual rails, would we be repeating this mechanism in each rail? Or just the set of supported languages per rail? I don't think we need to do it now (since we don't have other multilingual rails to test it), but we should be aware of what refactoring would be needed to move the below language detection to a shared level.
Of course, we can relax this constraint later and allow users more flexibility. Once we need to support other models or other types of rails (beyond content safety) that require multilingual responses, we can:
- Move the `detect_language` action from `library/content_safety/actions.py` to a shared location (`nemoguardrails/actions/`), making it available to all rails.
- We could also introduce a Colang-level abstraction like `bot refuse to respond $multilang=true`; this could be done easily for Colang 2.0, but I think it is better if we don't add new Colang features for now.
I agree, for now, keeping it scoped to content safety keeps the implementation focused.
```python
try:
    from fast_langdetect import detect

    result = detect(text, k=1)
    # ... (excerpt; the rest of the block continues in the diff)
```
Does fast-langdetect ever return a full locale with dialect, like en-US versus en? I don't see it in the docs, but I do see some upper/lowercase inconsistency.
Fair point, thanks for raising it. I just took a closer look at the fast-langdetect source code and the fastText model behavior:
- The fast-langdetect README mentions BCP-47 tags like `"zh-cn"` and `"pt-br"`,
- but the fastText lid.176.bin model uses simple ISO 639 codes: `zh`, `pt`, `en`, etc.
- The fast-langdetect source simply strips the `__label__` prefix from the fastText output; no regional mapping is applied.

Validated with an actual test:

```python
>>> detect("抱歉,我无法处理该请求", k=2)
[{'lang': 'zh', 'score': 0.80}, {'lang': 'ta', 'score': 0.08}]
```

It returns `"zh"`, NOT `"zh-cn"`, so no regional-variant handling is needed.
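Even so, a small defensive normalization step (hypothetical; not part of the PR) would make the lookup robust to case or regional-subtag surprises from any future detector:

```python
def normalize_lang_code(code: str) -> str:
    # Lowercase and strip any regional subtag, so "zh-CN", "zh_CN", and
    # "ZH" all normalize to "zh". Purely defensive: lid.176.bin already
    # emits bare lowercase ISO 639 codes.
    return code.lower().replace("_", "-").split("-")[0]
```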
This looks really good @Pouyanpi ! I have a few comments:
Not needed in this PR, but I'm thinking of RAG prompts where LLM instructions, the user query, and relevant context chunks are all flattened into a single prompt. These prompts can be pretty long (up to 7k tokens in some cases). I would be interested in a follow-on where we sample part of a prompt (e.g. 200 chars) before running classification on the sample. This would be an optional config field, giving customers a knob to trade off accuracy vs. latency for language detection.
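The sampling idea above could be as simple as a prefix slice; here is a sketch, where `sample_chars` stands in for the proposed (hypothetical) optional config field:

```python
def sample_for_detection(prompt: str, sample_chars: int = 200) -> str:
    # Classify only a fixed-size prefix of the (possibly very long,
    # flattened RAG) prompt, trading detection accuracy for latency.
    return prompt[:sample_chars]
```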
I've included them in the temp/lang-detect-benchmark branch to make review easier. If you find it easier, I will do that.
Yes
Yes, I would like to avoid adding Colang-level features as much as possible.
Done! updated the description.
fast-langdetect already does the truncation by default, but indeed we can give that flexibility to the users.
Why wouldn't we merge them into develop? It's best practice in ML to make any results reproducible, for which we need the input datasets and scripts. The datasets are public and linked above. I'd imagine we'll have to re-run evals for new languages as they're added to the content-safety and other models. So we'll run this script periodically.
Was that measured at a concurrency of 1? Having a 100% overhead for each language inference is a lot higher than I'd expect. We don't need to fix it in this PR.
+1
Could you check? I didn't see any length description.
Could you add optional Pydantic fields for any of these values that it makes sense to expose to users? Looking at the config, I think
Description
Detect user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.
TODO:
Language Detection Benchmark Results
Datasets Used
Chinese samples in Nemotron are all REDACTED; Chinese coverage was validated via the papluca dataset.
Prompt Length Analysis (characters)
Note: fast-langdetect truncates input at 80 characters by default (`max_input_length=80`), so longer prompts are effectively evaluated on their first 80 chars.

Overall Accuracy comparison
Latency comparison (μs)
Per Language Accuracy (fast-langdetect)
Per-Language Accuracy (lingua)
Why fast-langdetect?
https://github.com/LlmKira/fast-langdetect
Error analysis
Most errors occur with:
The action correctly falls back to English (en) for unsupported detected languages.
Benchmark Scripts
Check out the temp/lang-detect-benchmark branch.
Located in eval/language_detection/:
Make sure to have `datasets` and `pandas` installed:
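Assuming a standard Python environment, this is the usual pip invocation:

```shell
# Install the benchmark dependencies: datasets for loading the public
# corpora, pandas for aggregating the results.
pip install datasets pandas
```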